A 'Gibbs-Newton' Technique for Enhanced Inference of Multivariate Polya Parameters and Topic Models

Authors

  • Osama Khalifa
  • David W. Corne
  • Mike J. Chantler
Abstract

Hyper-parameters play a major role in the learning and inference process of latent Dirichlet allocation (LDA). Before the LDA latent variables can be learned, the values of these hyper-parameters must be fixed in advance. We propose an extension of LDA that we call ‘Latent Dirichlet allocation Gibbs Newton’ (LDA-GN), which places non-informative priors over these hyper-parameters and uses Gibbs sampling to learn appropriate values for them. At the heart of LDA-GN is our proposed ‘Gibbs-Newton’ algorithm, a new technique for learning the parameters of multivariate Polya distributions. We compare the performance of Gibbs-Newton with two prominent existing approaches to the latter task: Minka’s fixed-point iteration method and the Moments method. We then evaluate LDA-GN in two ways: (i) by comparing it with standard LDA in terms of the ability of the resulting topic models to generalize to unseen documents; (ii) by comparing it with standard LDA in its performance on a binary classification task.
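
For readers unfamiliar with the parameter-learning task the abstract refers to, the following is a minimal sketch of one of the baselines it names, Minka's fixed-point iteration for a multivariate Polya (Dirichlet-multinomial) distribution. It is not the authors' Gibbs-Newton algorithm, which the abstract does not describe. The function names, the starting point of all ones, the iteration limit, and the tolerance are illustrative assumptions, and the sketch assumes integer counts so that digamma differences reduce to finite sums.

```python
import random


def digamma_diff(a, n):
    """Return psi(a + n) - psi(a) for a > 0 and integer n >= 0.

    Uses the recurrence psi(x + 1) = psi(x) + 1/x, so no special-function
    library is needed when the counts are integers.
    """
    return sum(1.0 / (a + i) for i in range(n))


def minka_fixed_point(counts, iters=200, tol=1e-8):
    """Estimate Polya parameters from a list of integer count vectors.

    Update rule (Minka's fixed-point iteration):
      alpha_k <- alpha_k * sum_d [psi(n_dk + alpha_k) - psi(alpha_k)]
                         / sum_d [psi(n_d + sum(alpha)) - psi(sum(alpha))]
    """
    k = len(counts[0])
    totals = [sum(row) for row in counts]
    alpha = [1.0] * k  # assumed starting point, any positive vector works
    for _ in range(iters):
        s = sum(alpha)
        denom = sum(digamma_diff(s, n) for n in totals)
        new_alpha = [
            alpha[j] * sum(digamma_diff(alpha[j], row[j]) for row in counts) / denom
            for j in range(k)
        ]
        converged = max(abs(a - b) for a, b in zip(alpha, new_alpha)) < tol
        alpha = new_alpha
        if converged:
            break
    return alpha


def sample_polya(alpha, n_words, rng):
    """Draw one count vector: Dirichlet proportions, then multinomial counts."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    weights = [x / total for x in g]
    row = [0] * len(alpha)
    for j in rng.choices(range(len(alpha)), weights=weights, k=n_words):
        row[j] += 1
    return row


if __name__ == "__main__":
    rng = random.Random(0)
    true_alpha = [0.5, 2.0, 5.0]  # synthetic values chosen for illustration
    data = [sample_polya(true_alpha, 50, rng) for _ in range(500)]
    print(minka_fixed_point(data))
```

In LDA the same kind of estimate would apply to the document-topic and topic-word count matrices; here the synthetic data only serves to exercise the update rule.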

Similar resources

A Fast and Effective Framework for Lifelong Topic Model with Self-learning Knowledge

To discover semantically coherent topics from topic models, knowledge-based topic models have been proposed to incorporate prior knowledge into topic models. Moreover, some researchers propose lifelong topic models (LTM) to mine prior knowledge from topics generated from a multi-domain corpus without human intervention. LTM incorporates the learned knowledge from the multi-domain corpus into topic model...

Improved Bayesian Logistic Supervised Topic Models with Data Augmentation

Supervised topic models with a logistic likelihood have two issues that potentially limit their practical use: 1) response variables are usually over-weighted by document word counts; and 2) existing variational inference methods make strict mean-field assumptions. We address these issues by: 1) introducing a regularization constant to better balance the two parts based on an optimization formu...

Bayesian inference of genetic parameters for reproductive traits in Sistani native cows using Gibbs sampling

This study was undertaken to estimate the genetic parameters for some reproduction traits in Sistani beef cattle. The data set consisted of 1489 records of the number of inseminations and of calving and insemination dates across different calvings. Reproduction traits including calving interval (CI), gestation length (GL), days open (DO), calving to first service (CTFS), first service to conception (...

Lognormal and Gamma Mixed Negative Binomial Regression

In regression analysis of counts, a lack of simple and efficient algorithms for posterior computation has made Bayesian approaches appear unattractive and thus underdeveloped. We propose a lognormal and gamma mixed negative binomial (NB) regression model for counts, and present efficient closed-form Bayesian inference; unlike conventional Poisson models, the proposed approach has two free param...

Bayesian Inference of (Co) Variance Components and Genetic Parameters for Economic Traits in Iranian Holsteins via Gibbs Sampling

The aim of this study was to use a Bayesian approach via Gibbs sampling (GS) to estimate genetic parameters of production, reproduction and health traits in Iranian Holstein cows. Data consisted of 320666 first-lactation records of Holstein cows from 7696 sires and 260302 dams collected by the animal breeding center of Iran from 1991 to 2010. (Co)variance components were estimated using ...

Journal:
  • CoRR

Volume abs/1510.06646

Publication date: 2015